Abstract

The game-theory field of Collective INtelligence (COIN) concerns the design of computer-based
players engaged in a non-cooperative game so that as those players pursue their self-interests,
a pre-specified global goal for the collective computational system is achieved “as a
side-effect”. Previous implementations of COIN algorithms have outperformed conventional
techniques by up to several orders of magnitude, on domains ranging from telecommunications
control to optimization in congestion problems. Recent mathematical developments
have revealed that these previously developed algorithms were based on only two of the three 
factors determining performance. Consideration of only the third factor would instead lead 
to conventional optimization techniques like simulated annealing that have little to do with 
non-cooperative games. In this paper we present an algorithm based on all three terms at 
once. This algorithm can be viewed as a way to modify simulated annealing by recasting it 
as a non-cooperative game, with each variable replaced by a player. This recasting allows 
us to leverage the intelligent behavior of the individual players to substantially improve the 
exploration step of the simulated annealing. Experiments are presented demonstrating that 
this recasting significantly improves simulated annealing for a model of an economic process 
run over an underlying small-worlds topology. Furthermore, these experiments reveal novel 
small-worlds phenomena, and highlight the shortcomings of conventional mechanism design 
in bounded rationality domains. 


1 Introduction

There are three general types of distributed systems that are found in nature and that researchers
have translated into computational algorithms for function maximization. The first is exemplified
by Neo-Darwinian natural selection, which has been translated into genetic algorithms (GA’s) [1,
7, 13, 24]. These distributed systems can be viewed as finding a maximum of a function G that
takes as argument any single one of the system’s variables. (Each of those single variables is a
“genome”, with G of a genome being the “fitness” of the “phenotype” induced by that genome.)

Whereas systems of this first type have a “narrow G”, in the second type of distributed system,
the function G being optimized is “wide”, taking the state of the entire distributed system as its
argument. In some such distributed systems it is only in the crudest sense that the individual
variables can be viewed as players in a non-cooperative game. These systems comprise the second
type, and examples include simulated annealing (SA) [18, 11] and swarm intelligence [2, 20],
inspired by spin relaxation in physics and eusocial insect colonies, respectively.

In the third type of system, G is also wide, but the value of each of the individual variables
going into G is set by a player engaged in an over-arching non-cooperative game where each player
$\eta$ is trying to maximize its associated payoff utility function $g_\eta$. Roughly speaking, such collective
systems work when the utility functions of the individual variables/players are all “aligned” with
the world utility G. Under these circumstances, as the individual players pursue their self-interests,
the global goal for the full collective of maximizing G is achieved “as a side-effect”.
The primary naturally-occurring instances of such collectives are economic institutions where the
players are human beings, e.g., auctions and clearing of markets. In the computational versions
of such systems the players are instead computer programs [4, 5, 6, 12, 15, 17, 19, 30, 36, 48].

The “Collective INtelligence” (COIN) framework concerns the design of such collectives involving
non-cooperative games. In particular, it addresses the issue of how to generate, from a
provided G, the set of utilities that have optimal signal/noise for each player $\eta$ while also
having the property that as the individual players maximize those utilities, G gets maximized
(i.e., while also being “aligned with G”). This work on design of collectives can be viewed as an
extension of mechanism design [10] beyond human economics, to include concern for the
signal-to-noise ratio in the payoff functions and off-equilibrium behavior, to allow far more freedom
in choice of the $g_\eta$ than exists with human players (for example to encompass computational
systems in which the issue of incentive compatibility is moot), and also to encompass arbitrary G
and arbitrary dynamics of the system. Applications of this framework on problems from routing
in telecommunication networks [37, 44, 46] to congestion problems [47] have resulted in substantial
performance improvement over conventional techniques that do not consider issues like
signal-to-noise. Typically as the size of the collective grows, such improvements reach several
orders of magnitude.

Recent mathematical developments have shown that the previously developed COIN algo- 
rithms for design of collectives were based on only two of the three factors determining perfor- 
mance at maximizing G. Consideration of only the third factor would instead lead to conventional 
wide-G systems that have little to do with non-cooperative games, like simulated annealing. Con- 
sideration of all three terms at once therefore would result in an algorithm that combines the 
two types of wide-G function maximization systems, those having naturally-occurring analogues 
of human economics and of statistical physics, respectively. 

In this paper we present such a hybrid algorithm. Because of the similarity of this algorithm
to (certain aspects of) how human corporations are run, we call it the Computational
Corporation (CoCo) algorithm. Roughly speaking, it works by modifying the exploration step
of simulated annealing by having the new values of the variables be set by the moves of intelligent
players in a non-cooperative game rather than by sampling a probability distribution. Like
simulated annealing, the computational corporation algorithm is intended not to give the best
possible performance in all problem domains; an algorithm laboriously tailored for a particular
domain will invariably perform best for that domain [45]. Rather, like other algorithms related
to naturally-occurring distributed function maximizers, the computational corporation algorithm
is intended as a powerful and broadly applicable “off-the-shelf” algorithm.

In other work we present experiments demonstrating that the computational corporation algorithm
outperforms simulated annealing by several orders of magnitude for spin glass relaxation
and bin-packing [43]. Here we present such experiments for a simple economics model of a set
of people choosing among various potential formats for their home music-reproduction systems.
In this model the players interacted over an underlying ring-like network. The world utility was
the sum of each player $\eta$’s “happiness” function, which depended on factors like which of $\eta$’s
neighbors picked the same format it did, $\eta$’s intrinsic preference for the format it picked, and the
price, set by the level of global demand, of any reproduction in that format. In our experiments
we compared CoCo to SA and also to a variant of a conventional COIN algorithm that is similar
to the economics/mechanism design technique of providing incentives to players that “endogenize
their externalities”. It was found that when that network had only short-range connections,
the performance of CoCo at a certain iteration substantially exceeded that of the economics-like
COIN, verifying the sub-optimality of such conventional economics incentive schemes (world
utility equals 68 and 60, respectively, in arbitrary units, at a fixed time-step). In turn, both that
COIN and the CoCo system (the two game-theory-based systems) performed substantially better
than SA (G = 44). Performances were also consistent with the spin-glass and bin-packing convergence
results reported in [43], namely that CoCo would reach a given level of G about two
orders of magnitude more quickly than SA.

We then modified the network by incorporating a few random long-range connections, to
produce a small-worlds network [21, 26, 27, 28, 39]. It was found that the resultant decrease in
average inter-player distance resulted in a barely significant improvement in CoCo’s performance
(3%), and none in the other algorithms, contrary to typical results in the small-worlds literature.
However, if G was also changed, to reflect the total number of other players within a given
player $\eta$’s full neighborhood (rather than just $\eta$’s nearest neighbors), then going to a
small-worlds network improved performance substantially (10%). Note that the fact that all three
algorithms experienced this improvement to some degree suggests that a variant of the
small-worlds phenomenon generically extends beyond domains where it has previously been investigated
to maximization of high-dimensional functions defined over a network.


2 The Mathematics of Collective Intelligence 

The full formalization of the COIN framework extends significantly beyond what is needed for
this paper (see footnote 1). The restricted version needed here starts with an arbitrary vector space Z whose
elements $\zeta$ give the joint state of all the variables in the collective.

Footnote 1: That framework encompasses, for example, arbitrary dynamic redefinitions of the players (i.e., dynamic
reassignments of how the various subsets of the variables comprising the collective are assigned to players), as well as
modification of the players’ information sets (i.e., modification of inter-player communication). See [42].

We wish to search for the $\zeta$ that maximizes the provided world utility G. In addition to G
we are concerned with payoff utility functions $\{g_\eta\}$, one such function for each variable/player
$\eta$. We use the notation $\hat{\eta}$ to refer to all players other than $\eta$.

We will need to have a way to “standardize” utility functions so that the numeric value they
assign to a $\zeta$ only reflects their ranking of $\zeta$ relative to certain other elements of Z. We call such
a standardization of utility U for player $\eta$ the “intelligence for $\eta$ at $\zeta$ with respect to U”. Here
we will use intelligences that are equivalent to percentiles:

$$\epsilon_U(\zeta : \eta) = \int d\mu_{\zeta_{\hat{\eta}}}(\zeta')\, \Theta[U(\zeta) - U(\zeta')], \qquad (1)$$

where the subscript on the (normalized) measure $d\mu$ indicates it is restricted to $\zeta'$ sharing the same
non-$\eta$ components as $\zeta$, and where the Heaviside function $\Theta$ is defined to equal 1 when its
argument is greater than or equal to 0, and to equal 0 otherwise. Intelligence values are always
between 0 and 1.
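As an illustration of Eq. 1, the following sketch computes a percentile-style intelligence for a player with a finite set of moves. It assumes a uniform measure over that player's moves and represents the joint state as a tuple of integers; the function name `intelligence` and the toy utility used in the comments are illustrative choices, not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]


def intelligence(
    utility: Callable[[State], float],
    state: State,
    player: int,
    moves: Sequence[int],
) -> float:
    """Fraction of the player's possible moves (all other players held fixed)
    whose utility does not exceed the utility of the move actually made.

    Assumption: the measure in Eq. 1 is taken to be uniform over `moves`.
    """
    actual = utility(state)
    count = 0
    for move in moves:
        # Alternative state that shares all non-player components with `state`.
        alternative = state[:player] + (move,) + state[player + 1:]
        # Heaviside function: contributes 1 when its argument is >= 0.
        if actual - utility(alternative) >= 0:
            count += 1
    return count / len(moves)


# Toy usage: with utility = sum of all moves, a player choosing its largest
# available move has intelligence 1 (its move is at least as good as any other).
best = intelligence(lambda s: float(sum(s)), (3, 0), player=0, moves=[0, 1, 2, 3])
```

Under this discretization an intelligence of 1 means the player's move is a best response to the other players' moves, which matches the full-rationality reading of the quantity.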

Our uncertainty concerning the behavior of the system is reflected in a probability distribution
over Z. Our ability to control the system consists of setting the value of some characteristic of
the collective, e.g., setting the payoff functions of the players. Indicating that value by s, our
analysis revolves around the following central equation for P(G | s), which follows from Bayes’
theorem:

$$P(G \mid s) = \int d\vec{\epsilon}_G\, P(G \mid \vec{\epsilon}_G, s) \int d\vec{\epsilon}_g\, P(\vec{\epsilon}_G \mid \vec{\epsilon}_g, s)\, P(\vec{\epsilon}_g \mid s), \qquad (2)$$

where $\vec{\epsilon}_g = (\epsilon_{g_{\eta_1}}(\zeta : \eta_1), \epsilon_{g_{\eta_2}}(\zeta : \eta_2), \cdots)$ is the vector of the intelligences of the players with
respect to their associated payoff functions, and $\vec{\epsilon}_G = (\epsilon_G(\zeta : \eta_1), \epsilon_G(\zeta : \eta_2), \cdots)$ is the vector of
the intelligences of the players with respect to G.

Note that $\epsilon_{g_\eta}(\zeta : \eta) = 1$ means that player $\eta$ is fully rational at $\zeta$, in that its move maximizes
its payoff, given the moves of the other players. So a Nash equilibrium is a point $\zeta$ where $\epsilon_{g_\eta}(\zeta : \eta) = 1$
for all players $\eta$ (see footnote 2). On the other hand, a $\zeta$ at which all components of $\vec{\epsilon}_G$ equal 1 is a local maximum
of G (or more precisely, a critical point of the $G(\zeta)$ surface).

If we can choose s so that the third conditional probability in the integrand is peaked around
vectors $\vec{\epsilon}_g$ all of whose components are close to 1, then we have likely induced large (payoff
function) intelligences. If we can also have the second term be peaked about $\vec{\epsilon}_G$ equal to $\vec{\epsilon}_g$, then
$\vec{\epsilon}_G$ will also be large. Finally, if the first term in the integrand is peaked about high G when $\vec{\epsilon}_G$
is large, then our choice of s will likely result in high G, as desired.

Intuitively, the requirement that payoff functions have high “signal-to-noise” (an issue not
considered in conventional work in mechanism design) arises in the third term. It is in the second
term that the requirement that the payoff functions be “aligned with G” arises. Previously
developed COIN algorithms concentrated on these two terms. In contrast, non-game-theory
function maximization techniques like simulated annealing instead address how to have term 1
have the desired form. They do this by trying to ensure that the local maxima that the underlying
system settles on have high G.

Footnote 2: Consideration of points $\zeta$ at which not all intelligences equal 1 provides the basis for a model-independent
formalization of bounded rationality game theory, a formalization that contains variants of many of the theorems
of conventional full-rationality game theory. See [41].

It is the simultaneous concern for all three of the terms in Eq. 2 that underlies the CoCo 
algorithm. To present that algorithm we must first review some COIN results on how to simul- 
taneously set terms 2 and 3 to have the desired form. 

Our desired form for the second term in Eq. 2 is assured if the collective is factored, which
means that $\vec{\epsilon}_g$ equals $\vec{\epsilon}_G$ exactly for all $\zeta$. In game-theory language, the Nash equilibria of a
factored collective are local maxima of G. In addition to this desirable equilibrium behavior, factored
collectives also automatically provide appropriate off-equilibrium incentives to the players.

As a trivial example, any “team game” in which all the payoff functions equal G is factored [9,
25]. However, team games often have very poor forms for term 3 in Eq. 2, forms which get
progressively worse as the size of the collective grows. This is because for such payoff functions
each player $\eta$ will usually confront a very poor “signal-to-noise” ratio in trying to discern how its
actions affect its payoff $g_\eta = G$, since so many other players’ actions also affect G and therefore
dilute $\eta$’s effect on its own payoff function.

Previous COIN algorithms were based on varying the payoff functions $\{g_\eta\}$ to optimize the
signal/noise ratio reflected in the third term, subject to the requirement that the system be
factored. To understand how those algorithms work, given a measure $d\mu(\zeta_\eta)$, define the opacity
at $\zeta$ of utility U as

$$\Omega_U(\zeta : \eta, s) = \int d\zeta'\, J(\zeta' \mid \zeta)\, \frac{|U(\zeta) - U(\zeta'_{\hat{\eta}}, \zeta_\eta)|}{|U(\zeta) - U(\zeta_{\hat{\eta}}, \zeta'_\eta)|}, \qquad (3)$$

where J is defined in terms of the underlying probability distributions (see footnote 3).

The denominator absolute value in the integrand in Eq. 3 reflects how sensitive $U(\zeta)$ is to changing
$\zeta_\eta$. In contrast, the numerator absolute value reflects how sensitive $U(\zeta)$ is to changing $\zeta_{\hat{\eta}}$. So
the smaller the opacity of a payoff function $g_\eta$, the more $g_\eta(\zeta)$ depends only on the move of
player $\eta$, i.e., the better the associated signal-to-noise ratio for $\eta$. Intuitively then, lower opacity
should mean it is easier for $\eta$ to achieve a large value of its intelligence.

To formally establish this, we use the same measure $d\mu$ to define opacity as the one that defined
intelligence. Under this choice expected opacity bounds how close to 1 expected intelligence can
be [42]:

$$E(\epsilon_U(\zeta : \eta) \mid s) \le 1 - K, \quad \text{where } K \approx E(\Omega_U(\zeta : \eta, s) \mid s). \qquad (4)$$

Footnote 3: Writing it out in full, $J(\zeta' \mid \zeta)$ = [expression illegible in the source].
So low expected opacity of utility $g_\eta$ is a necessary condition for the third term in Eq. 2 to have
the desired form for player $\eta$. While low opacity is not, formally speaking, a sufficient condition
for $E(\epsilon_U(\zeta : \eta) \mid s)$ to be close to 1, in practice the bounds in Eq. 4 are usually tight.

In general it is not possible for a collective both to be factored and to have zero opacity for
all of its players. However, consider difference utilities, which are of the form

$$U(\zeta) = G(\zeta) - \Gamma(f(\zeta)), \qquad (5)$$

where $\Gamma(f)$ is independent of $\zeta_\eta$. Any difference utility is factored [42]. In addition, under usually
benign approximations, $E(\Omega_U \mid s)$ is minimized over the set of difference utilities by choosing

$$\Gamma(f(\zeta)) = E(G \mid \zeta_{\hat{\eta}}, s), \qquad (6)$$

up to an overall additive constant. We call the resultant difference utility the Aristocrat utility
(AU), loosely reflecting the fact that it measures the difference between a player’s actual action
and the average action.
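The Aristocrat utility can be sketched for a player with finitely many moves: the subtracted term is the expectation of G over that player's own moves with all other players held fixed. This is only an illustration under stated assumptions: the move probabilities are passed in explicitly (uniform by default, mirroring the rough approximation the paper later uses for its AU experiments), and the function name and the toy G are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

State = Tuple[int, ...]


def aristocrat_utility(
    G: Callable[[State], float],
    state: State,
    player: int,
    moves: Sequence[int],
    probs: Optional[Sequence[float]] = None,
) -> float:
    """G(state) minus the expected G over the player's own moves,
    with the moves of every other player held fixed."""
    if probs is None:
        # Assumption: uniform distribution over the player's moves.
        probs = [1.0 / len(moves)] * len(moves)
    expected = 0.0
    for move, p in zip(moves, probs):
        alternative = state[:player] + (move,) + state[player + 1:]
        expected += p * G(alternative)
    return G(state) - expected


def toy_G(state: State) -> float:
    # Hypothetical world utility, used only to exercise the function.
    return float(sum(state) ** 2)


au = aristocrat_utility(toy_G, (1, 2, 3), player=1, moves=[0, 1, 2, 3])
```

Because the subtracted expectation does not depend on the player's own move, changing only that move changes this utility by exactly the same amount as it changes G, which is the factoredness property stated above.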

If possible, we would like each player $\eta$ to use the associated AU as its payoff function to ensure
good form for both terms 2 and 3 in Eq. 2. This is not always feasible, however. The problem is
that to evaluate the expectation value defining its AU each player needs to evaluate the current
probabilities of each of its potential moves. However, if the player then changes its payoff function
to be the associated AU it will in general substantially change its ensuing behavior. (The player
now wants to choose moves that maximize a different function from the one it was maximizing
before.) In other words, it will change the probabilities of its moves, which means that its new
payoff function is in fact not the AU for its actual (new) probabilities.

There are ways around this self-consistency problem, but in practice it is often easier to bypass
the entire issue, by giving each $\eta$ a payoff function that does not depend on the probabilities of
$\eta$’s own moves. One such payoff function is the Wonderful Life Utility (WLU). The WLU for
player $\eta$ is parameterized by a pre-fixed clamping element $CL_\eta$ chosen from among $\eta$’s possible
moves:

$$WLU_\eta = G(\zeta) - G(\zeta_{\hat{\eta}}, CL_\eta). \qquad (7)$$

WLU is factored no matter what the choice of clamping element. Furthermore, while not matching
the low opacity of AU, WLU usually has far better opacity than does a team game.
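A minimal sketch of the Wonderful Life Utility for a discrete joint state follows. It assumes the state is a tuple with one entry per player and that clamping simply overwrites the player's entry; the function name and the toy world utility are illustrative, not from the paper.

```python
from typing import Callable, Tuple

State = Tuple[int, ...]


def wonderful_life_utility(
    G: Callable[[State], float],
    state: State,
    player: int,
    clamp: int,
) -> float:
    """G at the actual state minus G with the player's move replaced by
    the fixed clamping element (all other players unchanged)."""
    clamped = state[:player] + (clamp,) + state[player + 1:]
    return G(state) - G(clamped)


def toy_G(state: State) -> float:
    # Hypothetical world utility, used only to exercise the function.
    return float(sum(state) ** 2)


wlu = wonderful_life_utility(toy_G, (1, 2, 3), player=1, clamp=0)
```

The clamped term is the same whatever move the player actually makes, so no estimate of the player's own move probabilities is needed, which is the practical advantage described above.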

In many circumstances one can loosely interpret a particular choice of clamping element for
player $\eta$ as equivalent to a “null” move for player $\eta$, equivalent to removing that player from
the system. (Hence the name of this payoff function; cf. the Frank Capra movie.) For such
a clamping element, assigning the associated WLU to $\eta$ as its payoff function is closely related
to the economics technique of “endogenizing a player’s externalities” [10]. However, it is usually
the case that using WLU with a clamping element that is as close as possible to the expected
move defining AU results in far lower opacity than does clamping to the null move. Accordingly,
use of such an alternative WLU almost always results in far better values of G than does the
“endogenizing” WLU.

Typically, COINs in which the payoff functions are WLU or AU not only far outperform
team games, but also conventional function maximization techniques like simulated annealing.
However, note that even if the payoff functions in the COIN result in the collective’s having every
component of the vector $\vec{\epsilon}_G$ assuredly equal 1 (the best terms 2 and 3 we could hope for),
nothing in Eq. 2 precludes a poor value for $G(\zeta)$. This is because having all those intelligences
equal 1 only means that the collective is at a local maximum of G, not a global one.

This potential shortcoming is reflected in the first term in Eq. 2, a term that does not directly
depend on the choice of the players’ payoff functions. Crudely speaking, what that term reflects
is the propensity of the system to get stuck in a local maximum. Accordingly, one can use many
of the conventional exploration/exploitation function maximization techniques like simulated
annealing to induce a good form for that term. In the resulting hybrid of a COIN with such a
technique, at each iteration, the exploration
step is determined by the moves chosen by the players, rather than by using one of the random
sampling schemes that are traditionally employed. The exploitation step though is the same as
in the traditional formulation of the algorithm. In this way all three terms of Eq. 2 will have a
desired form, and the induced G should be large.

In its concern for all three terms this algorithm bears many similarities to well-run modern
human corporations, with G the “bottom line” of the entire corporation, the players $\eta$ identified
with the employees of the corporation, and the associated $g_\eta$ given by the employees’
performance-based compensation packages. For example, for a “factored corporation”, each
employee’s compensation package contains incentives designed such that the better the bottom line
of the corporation, the greater the employee’s compensation. In addition, if the compensation
packages are “low opacity”, the employees will have a relatively easy time discerning the
relationship between their behavior and their compensation. Finally, the centralized exploitation
process in CoCo is similar to the centralized decision-making of upper management that tries to
determine whether to abandon or stick with a particular set of behaviors by the employees. It is
due to these similarities that we call this algorithm the computational corporation algorithm.


3 Details of the CoCo algorithm 

In the version of simulated annealing explored here, at the beginning of the exploration step of
each time-step t, every player $\eta$ changed its move from the one it settled on at the end of the
preceding time-step, $\zeta_{\eta,t-1}$, with probability .25. That 25% probability was uniformly allotted
across all moves that differed from $\zeta_{\eta,t-1}$. Then in the exploitation step, G was evaluated for the
new point and compared to $G(\zeta_{,t-1})$. If the new point had a higher G, it became the final point
of the current time-step, $\zeta_{,t}$. Otherwise $\zeta_{,t}$ was chosen to be either the new point or $\zeta_{,t-1}$,
according to a temperature-parameterized Boltzmann distribution based on the two associated
G values.
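One time-step of the simulated annealing variant described above can be sketched as follows. This is a reading of the prose under stated assumptions: moves are integers `0..n_moves-1`, the Boltzmann choice between the two points uses weights proportional to exp(G/T), and the names `sa_step`, `p_change` and `temperature` are illustrative.

```python
import math
import random
from typing import Callable, Tuple

State = Tuple[int, ...]


def sa_step(
    G: Callable[[State], float],
    state: State,
    n_moves: int,
    temperature: float,
    rng: random.Random,
    p_change: float = 0.25,
) -> State:
    """One exploration/exploitation step of the simulated annealing variant."""
    # Exploration: each player changes its move with probability p_change,
    # choosing uniformly among the moves that differ from its current one.
    proposal = []
    for move in state:
        if rng.random() < p_change:
            move = rng.choice([m for m in range(n_moves) if m != move])
        proposal.append(move)
    new_state = tuple(proposal)

    # Exploitation: keep the new point outright if it has higher G.
    g_old, g_new = G(state), G(new_state)
    if g_new > g_old:
        return new_state

    # Otherwise pick between the two points with Boltzmann weights exp(G/T).
    # Assumption: written relative to g_old, so the exponent is never positive.
    w_new = math.exp((g_new - g_old) / temperature)
    p_new = w_new / (1.0 + w_new)
    return new_state if rng.random() < p_new else state


# Usage: a short run on a toy world utility (sum of all moves).
rng = random.Random(0)
state = (0,) * 10
for _ in range(50):
    state = sa_step(lambda s: float(sum(s)), state, 4, 1.0, rng)
```

Lowering `temperature` makes the step increasingly greedy, since a worse proposal is then accepted with vanishing probability.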

In the CoCo variant of simulated annealing, the exploitation step was unchanged while the
exploration step was modified to incorporate a COIN. Rather than have each player $\eta$ pick
an exploration move according to the simulated annealing distribution $h(\zeta_\eta)$, that move was
picked by sampling a distribution defined in terms of $c(\zeta_\eta)$, where the distribution $c(\zeta_\eta)$ was generated by
a reinforcement learning (RL) algorithm [8, 33, 46, 47] using a COIN-based utility function $g_\eta$.

The RL algorithm used was perhaps the simplest one possible. Each player $\eta$ would maintain a
running average of the reward it has received for each of its possible moves. (Those averages were
formed by exponentially weighting moves according to how long ago they were taken.) Those
averages were then used to specify a Boltzmann distribution (parameterized by an associated
“learning temperature”) over the possible moves. That distribution was sampled to decide $\eta$’s
next move. In these experiments, to form the initial averages, for the first 100 time-steps, the
moves for all players were chosen completely at random rather than via RL, and the associated
rewards recorded by each of the players.
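The learner described above can be sketched as a small class. The details here are assumptions rather than the authors' implementation: a fixed decay factor per time-step implements the exponential weighting, moves that have never been tried are given an average of 0, and the class and parameter names are hypothetical.

```python
import math
import random
from typing import List


class BoltzmannLearner:
    """Per-move exponentially weighted reward averages, sampled through a
    Boltzmann distribution with a 'learning temperature'."""

    def __init__(self, n_moves: int, decay: float = 0.9, temperature: float = 1.0):
        self.decay = decay
        self.temperature = temperature
        self.weighted_sum = [0.0] * n_moves
        self.weight = [0.0] * n_moves

    def record(self, move: int, reward: float) -> None:
        # Age every past observation by one time-step, then add the new one.
        for m in range(len(self.weight)):
            self.weighted_sum[m] *= self.decay
            self.weight[m] *= self.decay
        self.weighted_sum[move] += reward
        self.weight[move] += 1.0

    def averages(self) -> List[float]:
        # Assumption: a move that was never tried has average reward 0.
        return [s / w if w > 0 else 0.0
                for s, w in zip(self.weighted_sum, self.weight)]

    def distribution(self) -> List[float]:
        avg = self.averages()
        top = max(avg)  # subtract the maximum for numerical stability
        weights = [math.exp((a - top) / self.temperature) for a in avg]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self, rng: random.Random) -> int:
        moves = range(len(self.weight))
        return rng.choices(moves, weights=self.distribution())[0]


# Usage: record two rewards, then draw the next move.
learner = BoltzmannLearner(n_moves=4)
learner.record(0, 1.0)
learner.record(1, 5.0)
next_move = learner.sample(random.Random(0))
```

In a CoCo-style loop the reward passed to `record` would be the player's COIN-based payoff (for example a WLU or AU value) rather than G itself.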

The AU version of CoCo was based on the very rough approximation that each player $\eta$
had a uniform probability distribution over its possible moves. Together with our choice of an
extremely crude RL algorithm (and the resultant need to dedicate time-steps to a “thrashing”
training period in which G does not improve), these approximations constituted a significant
handicap for AU CoCo.


4 Experimental Results 


Small world networks are rings in which a few random long-range links are added. They have
been studied in a variety of domains, including immunology, communication networks, social
sciences, neurophysiology and information exchange [21, 23, 26, 27, 28, 38, 39]. It has been found
that despite the fact that there are relatively few long-range links in such networks, those links
drastically reduce the average smallest number of links connecting any two nodes. Accordingly,
addition of those few links results in information propagating across such networks far more
quickly.
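The effect of a few extra links on the average shortest path can be checked with a short sketch. The construction below is an illustration only: the network sizes, the number of extra links, and the helper names `ring_with_shortcuts` and `mean_path_length` are assumptions, and shortest paths are found by plain breadth-first search.

```python
import random
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[int, Set[int]]


def ring_with_shortcuts(
    n: int, n_extra: int, rng: random.Random, length: Optional[int] = None
) -> Graph:
    """Ring of n nodes plus n_extra additional undirected links.

    If `length` is given, each extra link joins nodes that far apart on the
    ring; otherwise both endpoints are drawn at random.
    """
    adj: Graph = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    added = 0
    while added < n_extra:
        a = rng.randrange(n)
        b = (a + length) % n if length is not None else rng.randrange(n)
        if a != b and b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1
    return adj


def mean_path_length(adj: Graph) -> float:
    """Average over all ordered node pairs of the smallest number of links."""
    n = len(adj)
    total = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


rng = random.Random(1)
plain_ring = ring_with_shortcuts(100, 0, rng)
short_links = ring_with_shortcuts(100, 6, rng, length=2)
random_links = ring_with_shortcuts(100, 6, rng)
lengths = [mean_path_length(g) for g in (plain_ring, short_links, random_links)]
```

Comparing the three entries of `lengths` shows how much each kind of extra link shortens typical paths relative to the plain ring.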


We wished to investigate whether the increased information speed accompanying those few 
extra links would improve the performance of a function-maximizer run on a function defined 
over the network, one whose maximization required coordination of the states across the entire 
network. In particular, we were interested in this issue when the function was an economic 
model, so that speeding up the function-maximization translated into more quickly settling into 
a desirable global economic state. 


The model we decided to investigate was one of coordination of choice of music-reproduction
format across the players/nodes. More precisely, each player $\eta$ on the network was given a choice
of four objects, and the player's move was to choose three of those. The motivation was that $\eta$
would buy music-reproduction systems for those three formats only. Accordingly, $\eta$ would only
buy music in those three formats. This led to a reduction in the global price for those three
formats, due to economies of scale in reproduction, which meant that all players using those
formats saw a lower price. In addition, $\eta$ was only able to exchange music in the three formats
it had chosen with its neighbors lying at most d links away. Finally, $\eta$ had an intrinsic preference
for each of the four formats, due to factors like their audio fidelity characteristics. These three
effects combined to set each player $\eta$’s “happiness” for a global configuration of all players’ format
choices. In turn, G was the sum over all players of their happinesses:


$$G = \sum_{i=1}^{N_f} \left( \sum_{j=1}^{N_a} x_{j,i} \right) \cdot \left( \sum_{j=1}^{N_a} \sum_{k \in \mathrm{neigh}_j} \mathrm{pref}_{j,i}\, n_{j,k,i} \right), \qquad (8)$$

where $N_f$ and $N_a$ are the numbers of formats and players, respectively; $\mathrm{neigh}_j$ is the set of
neighbors for player j (a maximum distance d away from player j); $\mathrm{pref}_{j,i}$ is the preference
of player j for format i, and was set randomly (to between 0 and 1) for each successive run;
$x_{j,i} = 1$ if player j has accepted format i and 0 otherwise; and $n_{j,k,i} = 1$ if a player (j) and
its neighbor (k) have accepted (and can therefore trade) the same format i, and 0 otherwise.
The first parenthesis reflects the global popularity of a format in the network, while the second
accounts for the popularity of a format in the local market (neighborhood) and individual player’s
preferences.
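One reading of Eq. 8, following the description of its two parenthesized factors, is sketched below: for each format, the number of players accepting it is multiplied by a preference-weighted count of neighbor pairs that share it. The data layout (lists of 0/1 acceptance flags, a neighbor dictionary) and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def world_utility(
    accepted: Sequence[Sequence[int]],
    prefs: Sequence[Sequence[float]],
    neighbors: Dict[int, List[int]],
) -> float:
    """Sum over formats of (global popularity) x (preference-weighted
    local popularity), under the assumed reading of Eq. 8.

    accepted[j][i] is 1 if player j accepted format i, else 0.
    prefs[j][i] is player j's intrinsic preference for format i.
    neighbors[j] lists the players within distance d of player j.
    """
    n_players = len(accepted)
    n_formats = len(accepted[0])
    total = 0.0
    for i in range(n_formats):
        # First factor: how many players accepted format i.
        popularity = sum(accepted[j][i] for j in range(n_players))
        # Second factor: for each player and each of its neighbors, count the
        # pair when both accepted format i, weighted by the player's preference.
        local = sum(
            prefs[j][i] * accepted[j][i] * accepted[k][i]
            for j in range(n_players)
            for k in neighbors[j]
        )
        total += popularity * local
    return total


# Two players who are each other's neighbor, two formats.
g = world_utility(
    accepted=[[1, 0], [1, 1]],
    prefs=[[0.5, 0.25], [1.0, 0.5]],
    neighbors={0: [1], 1: [0]},
)
```

In the toy call only the first format is shared by both players, so only it contributes to the total.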

We compared simulated annealing, the computational corporation algorithm, and a simplified
version of the standard economics approach of “endogenizing one’s externalities”. For CoCo,
there were three variants: The first used team game utilities (CoCo-TG), the second used
AU utilities with probabilities fixed to .25 for each of the four allowed moves (CoCo-AU), and
the third was a Wonderful Life Utility with clamping parameter (1, 1, 1, 1), i.e., clamping was
to the nominally illegal action of choosing all formats (CoCo-WLU). The last choice was the
“endogenizing”-like approach of WLU with a clamping parameter of (0, 0, 0, 0), i.e., of choosing no
formats at all (CoCo-Econ). For this last choice each player used a utility function of the marginal
contribution of that player to the global utility: that player computes the difference between the
world utility and the world utility when it abstains from participating in the exchange.




Figure 1: Small worlds network with 100 players (neighborhood size = 1): (a) Short range (left); 
and (b) Long range connections. 

Figure 1 shows the performance at time step 200 for two variants of an underlying ring of
100 players with 6 non-nearest-neighbor links superimposed. In the first variant all the extra
links were of length 2, positioned randomly. In the second, the extra links were purely random,
giving us a small-world network. All but team game CoCo algorithms significantly outperform
simulated annealing. Among those, AU CoCo and “accept-all” CoCo (WLU) also outperform
the economics-based approach. Also note that simulated annealing did not benefit from the
existence of long-range connections. However, both the different CoCo algorithms and the
economics-based algorithm showed modest improvements (3%) in the presence of the long-range
connections, showing that these algorithms used the new information more efficiently than did
simulated annealing.

This experiment demonstrated that the existence of long-range connections alone is not sufficient
for the system to exhibit significant “small worlds” phenomena, and attain higher performance.
In particular, the model used above did not account for the “local neighborhood”
(also known as path lengths [27]) which determines the distance over which a player can make
a trade. Our second set of experiments incorporated this concept by modifying G to reflect
the total number of other players within a given player $\eta$’s full neighborhood (rather than just
$\eta$’s nearest neighbors). Figure 2 shows the performance of CoCo and SA in this new problem.
Once again, the WLU and AU-based CoCo outperformed the economics-based CoCo, and all
three outperformed SA. Also, notice that in this case, the presence of the long-range connections
provided a marked improvement in the performance of all the algorithms (10%).
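The “full neighborhood” of a player, meaning every other player reachable within a given number of links, can be computed with a depth-limited breadth-first search. The sketch below is illustrative; the function name and the adjacency-dictionary layout are assumptions.

```python
from collections import deque
from typing import Dict, Set


def neighborhood(adj: Dict[int, Set[int]], node: int, d: int) -> Set[int]:
    """Players at most d links away from `node`, excluding `node` itself."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond distance d
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    del dist[node]
    return set(dist)


# A ring of 10 players, then the same ring with one long-range link 0-5.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
near = neighborhood(ring, 0, 3)
ring[0].add(5)
ring[5].add(0)
near_with_shortcut = neighborhood(ring, 0, 3)
```

With neighborhoods larger than 1, a single long-range link brings a whole cluster of distant players into range, which is one way to see why such links matter more when G depends on the full neighborhood.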




Figure 2: Small worlds network with 100 players (neighborhood size = 3): (a) Short range (left); 
and (b) Long range connections. 

Figure 3 compares the convergence properties of CoCo-WLU, CoCo-Econ and SA. The WLU-based
algorithm converged to high G very rapidly, while CoCo-Econ did so more slowly. Simulated
annealing, on the other hand, provided very poor G at t = 200. Projecting the convergence
rate of SA linearly (see footnote 4) gave over two orders of magnitude slower convergence than WLU-based
algorithms to good values of G. Also note the clear difference in the quality of the exploration
step between CoCo and SA: the exploration step of both CoCo algorithms improves dramatically
over time due to the players learning “where to explore”. SA on the other hand has minimal
improvement over the exploration space, which explains its slow convergence to good G.

Figure 3: Small worlds network with 100 players (neighborhood size = 3). [Plot: horizontal axis
“Time”, 0 to 200; vertical axis ticks from 35000 to 70000.]

Footnote 4: This favors SA since in practice its convergence rate slows down.


5 Conclusions

There are three general types of parallel systems found in nature that can be viewed as engaging
in maximization of a function G. These are exemplified by neo-Darwinian natural selection
(for G that take any single one of the elements of the parallel system as an argument), spin
glass relaxation (for G that take the entire system as argument), and clearing of markets in
economics relaxation (for G that take the entire system as argument and in which the overall
parallel system can be viewed as a non-cooperative game). All three types of systems have been
translated into computational algorithms, exemplified by genetic algorithms, simulated annealing,
and computational markets, respectively.

The Collective Intelligence framework can be viewed as an extension of conventional economics-based
systems of the third type, to reflect signal-to-noise issues and greater freedom in modifying
the individual players than exist in economies of human beings. It has traditionally been applied
only to systems of the third type. Recent mathematical advances in that framework have shown 
that those traditional COIN algorithms only account for two of the three factors determining 
performance. The third factor can be accounted for by integrating the COIN with a technique 
of the second type, like simulated annealing. Intuitively, such an integrated system, which we 
call a computational corporation, can be viewed as conventional simulated annealing modified by 
having the value of each variable in the exploration step of the SA be set by a (computer-based) 
player in an associated non-cooperative game. Doing this allows the leveraging of the intelligence 
of such players to improve the exploration, and thereby improve the performance. 

We present experiments demonstrating that the computational corporation algorithm outperforms
simulated annealing by two orders of magnitude for a model of an economic process run
over an underlying small-worlds topology. Furthermore, these experiments reveal novel
small-worlds phenomena, and highlight the shortcomings of conventional mechanism design in bounded
rationality domains.

Acknowledgements: The authors thank Michael New for helpful comments. 